Dew Math for .NET
Submits the Kernel to cmdQueue for computation with specified WorkSize and LocalSize.
When specified explicitly, (WorkSize mod LocalSize) must be zero. LocalSize is also called the workgroup size. Setting CPUAdjust to true reduces WorkSize by a factor of OPENCL_BLOCKLEN and assumes that the kernel contains internal for-loops. On CPU devices, kernel-internal for-loops can significantly speed up execution by lowering the function call overhead. On GPU devices, they cause large performance penalties, so CPUAdjust should be set to true only if the device is of CPU type. The for-loop pattern expected inside the kernel looks like this:
where BLOCK_LEN matches OPENCL_BLOCKLEN.
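The blocked for-loop idea described above can be illustrated with a minimal sketch. It is not the library's own listing: it emulates the pattern in plain C so it compiles without an OpenCL runtime. The names scale_block, reduced_work_size and scale_all, the value 8 for BLOCK_LEN, and the ceiling rounding of the reduced work size are all assumptions made for illustration. In a real OpenCL C kernel, gid would come from get_global_id(0).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value; in the library it must match OPENCL_BLOCKLEN. */
#define BLOCK_LEN 8

/* Number of work-items after reducing the work size by BLOCK_LEN.
   Rounding up is an assumption so that a partial last block is covered. */
size_t reduced_work_size(size_t length)
{
    return (length + BLOCK_LEN - 1) / BLOCK_LEN;
}

/* Body of one emulated work-item. Instead of handling a single element,
   it loops over a block of up to BLOCK_LEN consecutive elements.
   In an OpenCL C kernel, gid would be get_global_id(0). */
void scale_block(size_t gid, const float *src, float *dst,
                 float factor, size_t length)
{
    size_t start = gid * BLOCK_LEN;
    size_t end = start + BLOCK_LEN;

    assert(start < length);   /* gid must lie within the reduced work size */
    if (end > length)
        end = length;         /* the last block may be partial */

    for (size_t i = start; i < end; i++)
        dst[i] = src[i] * factor;
}

/* Emulates the device by running every work-item in turn. */
void scale_all(const float *src, float *dst, float factor, size_t length)
{
    size_t work_size = reduced_work_size(length);
    for (size_t gid = 0; gid < work_size; gid++)
        scale_block(gid, src, dst, factor, length);
}
```

In this sketch, 20 elements are processed by 3 work-items instead of 20, so the per-work-item call overhead is paid 3 times rather than 20. That is the effect the CPUAdjust remarks describe for CPU devices.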
Copyright (c) 1999-2024 by Dew Research. All rights reserved.